AITopics

2508.01733

Country: North America > United States > Texas > Travis County > Austin (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)
Health & Medicine > Therapeutic Area > Immunology (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Liu, Bolun, Lubold, Shane, Raftery, Adrian E., McCormick, Tyler H.

Bayesian Hyperbolic Multidimensional Scaling

arXiv.org Artificial Intelligence Aug-15-2023

Multidimensional scaling (MDS) is a widely used approach to representing high-dimensional, dependent data. MDS works by assigning each observation a location on a low-dimensional geometric manifold, with distance on the manifold representing similarity. We propose a Bayesian approach to multidimensional scaling when the low-dimensional manifold is hyperbolic. Using hyperbolic space facilitates representing tree-like structures common in many settings (e.g. text or genetic data with hierarchical structure). A Bayesian approach provides regularization that minimizes the impact of measurement error in the observed data and assesses uncertainty. We also propose a case-control likelihood approximation that allows for efficient sampling from the posterior distribution in larger data settings, reducing computational complexity from approximately $O(n^2)$ to $O(n)$. We evaluate the proposed method against state-of-the-art alternatives using simulations, canonical reference datasets, Indian village network data, and human gene expression data.
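
As a rough illustration of why hyperbolic distance suits tree-like data, the sketch below computes the distance between two points in the Poincaré ball model. The choice of the Poincaré ball is an assumption made here for illustration; the abstract does not say which model of hyperbolic space the authors use, and the function name `poincare_distance` is hypothetical.

```python
import math


def poincare_distance(u, v):
    """Hyperbolic distance between two points strictly inside the unit ball.

    Illustrative only: uses the Poincare ball model, which is an assumption
    and not necessarily the parameterization used in the paper.
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


# Two points at the same Euclidean separation are hyperbolically much
# farther apart when they sit close to the boundary of the ball.
near_center = poincare_distance((0.0, 0.0), (0.1, 0.0))
near_edge = poincare_distance((0.85, 0.0), (0.95, 0.0))
```

Distances grow rapidly toward the boundary, which gives the leaves of a tree room to spread out while staying close to their parent nodes.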

artificial intelligence, dissimilarity, machine learning, (19 more...)

2210.15081

Country:

Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > District of Columbia > Washington (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Neural Information Processing Systems Apr-6-2023, 18:33:27 GMT

Multidimensional Scaling and Data Clustering

Visualizing and structuring pairwise dissimilarity data are difficult combinatorial optimization problems known as multidimensional scaling or pairwise data clustering. Algorithms for embedding dissimilarity data set in a Euclidian space, for clustering these data and for actively selecting data to support the clustering process are discussed in the maximum entropy framework. Active data selection provides a strategy to discover structure in a data set efficiently with partially unknown data.

dissimilarity data, multidimensional scaling and data clustering

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

arXiv.org Artificial Intelligence Sep-27-2020

NN-EVCLUS: Neural Network-based Evidential Clustering

Denoeux, Thierry

Evidential clustering is an approach to clustering based on the use of Dempster-Shafer mass functions to represent cluster-membership uncertainty. In this paper, we introduce a neural-network based evidential clustering algorithm, called NN-EVCLUS, which learns a mapping from attribute vectors to mass functions, in such a way that more similar inputs are mapped to output mass functions with a lower degree of conflict. The neural network can be paired with a one-class support vector machine to make it robust to outliers and allow for novelty detection. The network is trained to minimize the discrepancy between dissimilarities and degrees of conflict for all or some object pairs. Additional terms can be added to the loss function to account for pairwise constraints or labeled data, which can also be used to adapt the metric. Comparative experiments show the superiority of NN-EVCLUS over state-of-the-art evidential clustering algorithms for a range of unsupervised and constrained clustering tasks involving both attribute and dissimilarity data.

artificial intelligence, data mining, machine learning, (17 more...)

2009.12795

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(8 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine Learning Jul-2-2014

How Many Dissimilarity/Kernel Self Organizing Map Variants Do We Need?

Rossi, Fabrice

In numerous applicative contexts, data are too rich and too complex to be represented by numerical vectors. A general approach to extend machine learning and data mining techniques to such data is to rely on a dissimilarity or on a kernel that measures how different or similar two objects are. This approach has been used to define several variants of the Self Organizing Map (SOM). This paper reviews those variants using a common set of notations in order to outline differences and similarities between them.

artificial intelligence, machine learning, survey article, (18 more...)

doi: 10.1007/978-3-319-07695-9_1

1407.0611

Country:

Europe (1.00)
North America > United States > California (0.28)

Genre:

Research Report (0.90)
Overview (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.93)

Olteanu, Madalina, Villa-Vialaneix, Nathalie, Cottrell, Marie

On-line relational SOM for dissimilarity data

arXiv.org Machine Learning Dec-27-2012

In some applications and in order to address real world situations better, data may be more complex than simple vectors. In some examples, they can be known through their pairwise dissimilarities only. Several variants of the Self Organizing Map algorithm were introduced to generalize the original algorithm to this framework. Whereas median SOM is based on a rough representation of the prototypes, relational SOM allows representing these prototypes by a virtual combination of all elements in the data set. However, this latter approach suffers from two main drawbacks. First, its complexity can be large. Second, only a batch version of this algorithm has been studied so far and it often provides results having a bad topographic organization. In this article, an on-line version of relational SOM is described and justified. The algorithm is tested on several datasets, including categorical data and graphs, and compared with the batch version and with other SOM algorithms for non vector data.

algorithm, artificial intelligence, machine learning, (18 more...)

1212.6316

Country: Europe (0.28)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Conan-Guez, Brieuc, Rossi, Fabrice

Dissimilarity Clustering by Hierarchical Multi-Level Refinement

arXiv.org Machine Learning Apr-29-2012

We introduce in this paper a new way of optimizing the natural extension of the quantization error used in k-means clustering to dissimilarity data. The proposed method is based on hierarchical clustering analysis combined with multilevel heuristic refinement. The method is computationally efficient and achieves better quantization errors than the relational k-means.

algorithm, artificial intelligence, machine learning, (17 more...)

1204.6509

Country:

Europe (0.47)
North America > United States > New York (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Hofmann, Thomas, Buhmann, Joachim

Multidimensional Scaling and Data Clustering

Neural Information Processing Systems Dec-31-1995

Visualizing and structuring pairwise dissimilarity data are difficult combinatorial optimization problems known as multidimensional scaling or pairwise data clustering. Algorithms for embedding dissimilarity data set in a Euclidian space, for clustering these data and for actively selecting data to support the clustering process are discussed in the maximum entropy framework. Active data selection provides a strategy to discover structure in a data set efficiently with partially unknown data.

1 Introduction

Grouping experimental data into compact clusters arises as a data analysis problem in psychology, linguistics, genetics and other experimental sciences. The data which are supposed to be clustered are either given by an explicit coordinate representation (central clustering) or, in the non-metric case, they are characterized by dissimilarity values for pairs of data points (pairwise clustering). In this paper we study algorithms (i) for embedding non-metric data in a D-dimensional Euclidian space, (ii) for simultaneous clustering and embedding of non-metric data, and (iii) for active data selection to determine a particular cluster structure with minimal number of data queries. All algorithms are derived from the maximum entropy principle (Hertz et al., 1991) which guarantees robust statistics (Tikochinsky et al., 1984).
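
To make the embedding task in (i) concrete, here is a minimal sketch of classical (Torgerson) multidimensional scaling with NumPy. This is a standard textbook technique swapped in for illustration, not the maximum entropy algorithm the paper derives; the function name `classical_mds` and the toy data are hypothetical.

```python
import numpy as np


def classical_mds(D, dim=2):
    """Embed n objects in `dim` Euclidean dimensions from an n x n
    dissimilarity matrix D, using classical (Torgerson) MDS.

    Illustrative stand-in only; not the maximum entropy method of the paper.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Double-center the squared dissimilarities to get a Gram-like matrix.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    # Keep the `dim` largest eigenpairs; clip negative eigenvalues, which
    # can appear when the dissimilarities are not Euclidean.
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:dim]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Toy check: distances between four planar points are recovered
# (up to rotation, reflection, and translation of the embedding).
X = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
Y = classical_mds(D, dim=2)
D_embedded = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
```

Classical MDS reproduces the input exactly only when the dissimilarities are Euclidean distances. For non-metric data, the case the paper targets, the clipped eigenvalues make the result an approximation.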

algorithm, dissimilarity, multidimensional scaling, (15 more...)

Country:

North America > United States > New York (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Hofmann, Thomas, Buhmann, Joachim

Multidimensional Scaling and Data Clustering

Neural Information Processing Systems Dec-31-1995

Visualizing and structuring pairwise dissimilarity data are difficult combinatorial optimization problems known as multidimensional scaling or pairwise data clustering. Algorithms for embedding dissimilarity data set in a Euclidian space, for clustering these data and for actively selecting data to support the clustering process are discussed in the maximum entropy framework. Active data selection provides a strategy to discover structure in a data set efficiently with partially unknown data.

1 Introduction

Grouping experimental data into compact clusters arises as a data analysis problem in psychology, linguistics, genetics and other experimental sciences. The data which are supposed to be clustered are either given by an explicit coordinate representation (central clustering) or, in the non-metric case, they are characterized by dissimilarity values for pairs of data points (pairwise clustering). In this paper we study algorithms (i) for embedding non-metric data in a D-dimensional Euclidian space, (ii) for simultaneous clustering and embedding of non-metric data, and (iii) for active data selection to determine a particular cluster structure with minimal number of data queries. All algorithms are derived from the maximum entropy principle (Hertz et al., 1991) which guarantees robust statistics (Tikochinsky et al., 1984).

algorithm, artificial intelligence, machine learning, (17 more...)

Country: Europe > Germany > North Rhine-Westphalia (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)